Cost-sensitive Global Model Trees applied to loan charge-off forecasting

Authors

  • Marcin Czajkowski
  • Monika Czerwonka
  • Marek Kretowski
Abstract

Keywords: Cost-sensitive regression; Model trees; Evolutionary algorithms; Asymmetric costs; Loan charge-off forecasting

Regression learning methods in real-world applications often require cost minimization rather than the reduction of various metrics of prediction error. The literature currently lacks white-box solutions for forecasting problems in which under-prediction and over-prediction errors have different consequences. To fill this gap, we introduce the Cost-sensitive Global Model Tree (CGMT), which applies a fitness function that minimizes the average misprediction cost. The proposed specialized genetic operators improve the search for the optimal tree structure and for the cost-sensitive linear regression models in the leaves. Experimental validation is performed on loan charge-off data, which is known to be a difficult forecasting problem for banks because of its asymmetric cost structure. The results show that the specialized evolutionary algorithm applied to model tree induction yields significantly more accurate predictions than the tested competitors. Decisions generated by the CGMT are simple, easy to interpret, and can be applied directly.

Many real-world problems are cost-sensitive, which means that different types of prediction errors are not equally costly [5]. As a result, the typical minimization of prediction errors is not the best approach. The term cost-sensitive encompasses all types of learning in which cost is considered [50,19], and different types of costs (e.g., costs of attributes, costs of instances, and costs of errors) can be distinguished. Current research focuses on a single cost for decision making; however, multiple costs are also investigated [35].

For example, in medical diagnosis, there are several types of costs that can be minimized, such as the cost of misclassification (e.g., overlooking an ill patient can be fatal, in contrast to a false positive test) or the cost of treatment (e.g., financial or risk). When speculating on the stock exchange, investors directly compare future gains and losses and usually give more weight to losses. Researchers have shown that potential gains need to be approximately twice as large to offset potential losses [51]. As a consequence, investors tend to realize their gains more often than their losses, as they sell winning stocks more readily. There are many other examples of such asymmetry, including bankruptcy prediction [57], behavioral finance [45], expected stock returns [2], criminal justice settings [6], physician prognostic behavior [1], and product recommendations [32]. Cost-sensitive regression is still not adequately addressed in …
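
The asymmetric cost idea described above can be illustrated with a small sketch. The Python function below computes an average misprediction cost in which under-prediction and over-prediction are penalized at different rates. The linear cost form, the function name `average_misprediction_cost`, and the weights `under_cost` and `over_cost` are assumptions made for illustration; this excerpt does not specify the cost function actually used by CGMT.

```python
def average_misprediction_cost(y_true, y_pred, under_cost=2.0, over_cost=1.0):
    """Average asymmetric cost over a set of predictions.

    Assumption (not taken from the paper): cost grows linearly with the
    absolute error, at rate `under_cost` when the prediction is too low
    and at rate `over_cost` when it is too high.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        error = predicted - actual
        if error < 0:
            # Under-prediction: the forecast was below the actual value.
            total += under_cost * (-error)
        else:
            # Over-prediction (or an exact hit, which costs nothing).
            total += over_cost * error
    return total / len(y_true)


# Hypothetical actual values and forecasts.
actual = [10.0, 20.0, 30.0]
forecast = [8.0, 20.0, 33.0]

# Under-predicting by 2 costs 2 * 2 = 4; over-predicting by 3 costs 1 * 3 = 3.
asymmetric = average_misprediction_cost(actual, forecast)
# With equal weights the same errors reduce to the mean absolute error.
symmetric = average_misprediction_cost(actual, forecast, under_cost=1.0, over_cost=1.0)
```

With equal weights the measure reduces to the mean absolute error, so the two weights are what make a learner prefer one direction of error over the other. An evolutionary search could use such a value as the quantity to minimize, although how CGMT does so is not detailed in this excerpt.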

Similar resources

Tuning Data Mining Methods for Cost-Sensitive Regression: A Study in Loan Charge-Off Forecasting

Real-world predictive data mining (classification or regression) problems are often cost-sensitive, meaning that different types of prediction errors are not equally costly. While cost-sensitive learning methods for classification problems have been extensively studied recently, cost-sensitive regression has not been adequately addressed in the data mining literature yet. In this paper, we firs...

Utilization of Electric Vehicles for Improvement of Daily Load Factor in the Price-Responsive Environment of Smart Grids

Using electric vehicles, in addition to decreasing the environmental concerns, can play an important role in decreasing the peak and filling the off-peaks of the daily load characteristics. In other words, in smart grids' infrastructure, the load characteristics can be improved by scheduling the charge and discharge process of electric vehicles. In smart grids, the customers are instantaneously...

An Experimental and Comparative Analysis of the Battery Charge Controllers in Off-Grid PV Systems

The study of the battery charge process as the only power storage agent in off-grid systems is of significant importance. The battery charge process has different modes, and the battery in these modes is dependent on the amount of charge. In order to charge the battery in off-grid systems, two charge controllers including Pulse Width Modulation (PWM) and Maximum Power Point Tracker (MPPT) are c...

Forecasting Palladium Price Using GM(1,1)

Palladium is an element of the PGM group that has significant physical properties, which has drawn increasing attention to this metal. Because of the wide range of applications of palladium in industry and its use in jewelry, its price plays an important role in the economy. Therefore, forecasting its price is a crucial subject in economics and engineering design. This paper proposes the model GM(1,1) to predict the Palladium ...

Journal:
  • Decision Support Systems

Volume 74  Issue 

Pages  -

Publication date: 2015